The Backbone of a City
Recent studies have revealed the importance of centrality measures for analyzing various spatial
factors affecting human life in cities. Here we show how it is possible to extract the backbone of
a city by deriving spanning trees based on edge betweenness and edge information. Using the cities
of Bologna and San Francisco as sample cases, we show that the resulting trees are radically
different from those based on edge lengths, and allow an extended comprehension of the "skeleton"
of most important routes that so strongly affects pedestrian/vehicular flows, retail commerce vitality,
land-use separation, urban crime and collective dynamical behaviours.



Centrality is a fundamental concept in network analysis.
The issue of structural centrality was introduced in the
1940s in the context of social systems, where a relation
was assumed between the location of an individual in the
network and his or her influence in group processes.
Since then, various measures have been proposed over the
years to quantify the importance of the nodes and edges of
a graph, and the concept of centrality has also found many
applications in biology and technology.

In economic geography as well as in regional planning,
centrality has dominated the scene especially since the
1960s and 1970s, stressing the idea that some places
(cities, settlements) are more important than others
because they are more "accessible". Here accessibility
was intended as a centrality measure of the same kind as
those developed in the field of structural sociology, with
the difference that the geographic nature of elements in
space was preserved through a notion of metric distance.
In the field of urban design, a long-term effort has been
spent in order to understand which urban streets and routes
constitute the "skeleton" of a city, meaning the chains of
urban spaces that are most important for its connectedness,
liveability and safety at the local scale, and for its
legibility in terms of human wayfinding [9]; more recently,
these latter two approaches seem to be merging in the first
clues of a cognitive/configurational theory [10].
After an in-depth investigation of both the topological
(dual) and the spatial (primal) graph representations of
street networks, in this paper we provide a tool for the
analysis of the backbone of a complex urban system
represented as a spatial (planar) graph. The tool is based
on the mathematical concept of spanning trees, and on the
efficiency of centrality measures in capturing the
essential edges of a graph. Unlike previous applications
of this same concept, we consider spatial networks instead
of topological ones, so that our trees can be shown
graphically on the city maps and can serve as a support in
urban design and planning. Moreover, we consider two
different kinds of edge centrality measures, and we compare
the resulting trees with the standard spanning trees based
on minimizing the total length.



In our approach, cities are represented as spatial networks
(networks embedded in real space), i.e. networks whose
nodes occupy a precise position in a two-dimensional
Euclidean space, and whose edges are real physical
connections. In this approach, 1-square-mile samples of
urban street patterns selected from Ref. are transformed
into spatial undirected graphs by mapping the intersections
into the graph nodes and the roads into links between
nodes. Here we will focus, in particular, on the cities of
Bologna and San Francisco as examples of two different
classes of urban street patterns: the former is a
self-organized organic network that evolved over a long
period of time through the uncoordinated contribution of
countless historical agents, while the latter is a mostly
planned fabric built in a relatively short period of time
following the ideas of one coordinating historical agent.
Each of the two obtained
graphs is denoted as $G = G(N, K)$, where $N$ and $K$ are,
respectively, the number of nodes and links in the graph.
In the case of Bologna we have $N = 541$ and $K = 773$,
while in the case of San Francisco the same area of
1 square mile contains only $N = 169$ nodes and $K = 271$
edges. The average degree $\langle k \rangle = 2K/N$ is
respectively equal to 2.86 and 3.21. This difference is due
to the overabundance of three-way intersections with
respect to four-way intersections in the city of Bologna.
The converse is true for the city of San Francisco, due to
its square-grid structure. See Ref. for a plot of the entire
degree distributions in the two cases. The graph nodes
are characterized by their positions in the unit square,
$\{x_i, y_i\}_{i=1,\dots,N}$, while the links follow the
footprints of real streets and are associated with a set of
real positive numbers representing the street lengths,
$\{l_\alpha\}_{\alpha=1,\dots,K}$. Another relevant
difference between the two cities is captured by the edge
length distribution. In Fig. 1 we plot $n(l)$, the number
of edges of length $l$, as a function of $l$. The edge
length distribution has a single peak in Bologna, while it
has more than one peak in a mostly planned city such as
San Francisco, due to its grid pattern. In the following,
the graph representing a city is described by the
$N \times N$ adjacency matrix $A$, whose entry $a_{ij}$ is
equal to 1 when there is an edge between $i$ and $j$ and 0
otherwise, and by an $N \times N$ matrix $L$, whose entry
$l_{ij}$ is the value associated with the edge $\alpha$
joining $i$ and $j$, in our case the metric length of the
corresponding street.

FIG. 1: Top panels: the length distributions for the two cities
of Bologna and San Francisco (full lines) are compared with
the length distributions of the respective betweenness-based
MCSTs (dashed lines). The quantity $n(l)$ is defined as the
number of edges whose length is in the range [$l - 5$ meters,
$l + 5$ meters]. Bottom panels: cumulative distributions of
edge betweenness $C^B$ (left) and information $C^I$ (right) for
Bologna (circles) and San Francisco (squares). The dashed
lines in the left panel are exponential fits to the betweenness
distributions.
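The matrix description of a city graph can be illustrated with a minimal Python sketch. The four-node toy network, its coordinates, and the helper names `build_matrices` and `average_degree` are hypothetical and are not taken from the Bologna or San Francisco data. As a simplifying assumption, each street length is taken to be the straight-line distance between its end points, whereas in the text the lengths follow the footprints of the real streets.

```python
import math

# Hypothetical toy data: four intersections at the corners of a unit block
# plus one diagonal street. Coordinates and edges are illustrative only.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]


def build_matrices(nodes, edges):
    """Return the adjacency matrix A and the length matrix L as nested lists."""
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        # Assumption: a street is a straight segment, so its length is the
        # Euclidean distance between the two intersections it joins.
        length = math.dist(nodes[i], nodes[j])
        A[i][j] = A[j][i] = 1
        L[i][j] = L[j][i] = length
    return A, L


def average_degree(A):
    """Average degree 2K/N computed from the adjacency matrix."""
    n = len(A)
    twice_k = sum(sum(row) for row in A)  # each undirected edge is counted twice
    return twice_k / n


A, L = build_matrices(nodes, edges)
mean_k = average_degree(A)
```

For the toy network, `mean_k` evaluates 2K/N with K = 5 and N = 4.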

In a previous work, different measures of node centrality,
properly extended for spatial graphs, have been
investigated on the same database of urban street
patterns. Here we show how to construct spanning trees
based on edge centrality. We first localize high-centrality
edges, namely the streets that are structurally made to be
traversed (betweenness centrality) or the streets whose
deactivation affects the global properties of the system
(information centrality). Of course, other definitions of
edge centrality (for instance range, closeness or
straightness [18]) can be used as well. The definitions of
edge betweenness and edge information we adopt are
straightforward modifications of the centrality measures
defined on nodes.

The edge betweenness centrality, $C^B$, is based on the idea
that an edge is central if it is included in many of the
shortest paths connecting pairs of nodes. The betweenness
centrality of edge $\alpha = 1, \dots, K$ is defined as [20]:

$$C^B_\alpha = \frac{1}{(N-1)(N-2)} \sum_{j,k=1,\dots,N;\ j \neq k} \frac{n_{jk}(\alpha)}{n_{jk}}$$

where $n_{jk}$ is the number of shortest paths between nodes
$j$ and $k$, and $n_{jk}(\alpha)$ is the number of shortest
paths between nodes $j$ and $k$ that contain edge $\alpha$.
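As an illustration of this definition, the following Python sketch counts, for every ordered pair of nodes, the fraction of shortest paths that traverse each edge. It is a brute-force sketch for small graphs, not an efficient implementation. The function names are hypothetical, and the prefactor 1/((N-1)(N-2)) is an assumed normalization; the ranking of edges does not depend on it.

```python
import heapq

TOL = 1e-9  # tolerance used when comparing path lengths


def shortest_path_counts(adj, source):
    """Dijkstra from `source`: distances and number of shortest paths per node."""
    dist = {source: 0.0}
    sigma = {source: 1}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v] - TOL:
                dist[v] = nd          # strictly shorter path found
                sigma[v] = sigma[u]
                heapq.heappush(heap, (nd, v))
            elif abs(nd - dist[v]) <= TOL:
                sigma[v] += sigma[u]  # another path of the same length
    return dist, sigma


def edge_betweenness(n, edges):
    """edges: list of (i, j, length). Returns one betweenness value per edge."""
    adj = {i: [] for i in range(n)}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    info = [shortest_path_counts(adj, s) for s in range(n)]
    result = [0.0] * len(edges)
    for a, (u, v, w) in enumerate(edges):
        for j in range(n):
            dj, sj = info[j]
            for k in range(n):
                if k == j or k not in dj:
                    continue
                dk, sk = info[k]
                through = 0
                # The edge can be crossed as u->v or as v->u on a j-k path.
                for x, y in ((u, v), (v, u)):
                    if x in dj and y in dk and abs(dj[x] + w + dk[y] - dj[k]) <= TOL:
                        through += sj[x] * sk[y]
                result[a] += through / sj[k]
    norm = (n - 1) * (n - 2)  # assumed normalization
    return [c / norm for c in result]


# Hypothetical example: a square block with four streets of unit length.
square = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
cb_square = edge_betweenness(4, square)
```

By symmetry all four edges of the square obtain the same value, since each edge lies on the single shortest path between its end points and on one of the two shortest paths between opposite corners.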
The edge information centrality, $C^I$, is a measure
relating the edge importance to the ability of the network
to respond to the deactivation of the edge itself. The
network performance, before and after a certain edge is
deactivated, is measured by the efficiency of the graph $G$
[21, 22]. The information centrality of edge $\alpha$ is
defined as the relative drop in the network efficiency
caused by the removal from $G$ of the edge $\alpha$:

$$C^I_\alpha = \frac{\Delta E}{E} = \frac{E[G] - E[G']}{E[G]}$$

where the efficiency of a graph $G$ is defined as:

$$E[G] = \frac{1}{N(N-1)} \sum_{i,j=1,\dots,N;\ i \neq j} \frac{1}{d_{ij}}$$

with $d_{ij}$ the length of the shortest path between nodes
$i$ and $j$, and where $G'$ is the graph with $N$ nodes and
$K - 1$ edges obtained by removing edge $\alpha$ from the
original graph $G$. An advantage of using the efficiency
instead of the characteristic path length $L$ [24] to
measure the performance of a graph is that $E[G]$ is finite
even for disconnected graphs.
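The relative efficiency drop can be computed directly by removing one edge at a time. The Python sketch below assumes the efficiency is the average of the inverse shortest-path lengths over all ordered pairs of nodes, with unreachable pairs contributing zero; the function names are hypothetical.

```python
import heapq


def distances_from(adj, source):
    """Dijkstra: shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def efficiency(n, edges):
    """Average of 1/d_ij over ordered pairs (assumed form of the efficiency)."""
    adj = {i: [] for i in range(n)}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    total = 0.0
    for i in range(n):
        dist = distances_from(adj, i)
        # Unreachable nodes are absent from `dist` and therefore add nothing,
        # which keeps the efficiency finite for disconnected graphs.
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))


def information_centrality(n, edges):
    """Relative drop in efficiency caused by removing each edge in turn."""
    e_full = efficiency(n, edges)
    drops = []
    for a in range(len(edges)):
        reduced = edges[:a] + edges[a + 1:]
        drops.append((e_full - efficiency(n, reduced)) / e_full)
    return drops


# Hypothetical example: three intersections along one street, unit spacing.
path = [(0, 1, 1.0), (1, 2, 1.0)]
e_path = efficiency(3, path)
ci_path = information_centrality(3, path)
```

Each call to `information_centrality` recomputes all shortest paths K times, which is acceptable for networks of a few hundred nodes but would need a smarter update scheme for larger ones.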

In Fig. 1 we report the cumulative distributions of edge
betweenness and information. The cumulative distribution
$P(C)$ is defined as:

$$P(C) = \int_C^{+\infty} \frac{n(C')}{K}\, dC'$$

where $n(C)$ is the number of edges with centrality equal
to $C$. The edge distributions are quite similar in the two
cities of Bologna and San Francisco. In particular, the
betweenness distributions are well fitted by exponential
curves, $P(C^B) \sim \exp(-C^B/s)$, with coefficients
respectively equal to $s_{Bo} = 0.020$ and $s_{SF} = 0.029$.
Thus, for the edge betweenness, the distributions found are
similar (single-scale) to those observed for the node
betweenness. Conversely, the edge information distributions
do not have a well-defined shape: although their decay is
slower than exponential in both Bologna and San Francisco,
the edge information distributions do not allow one to
differentiate self-organized cities from planned ones, as
was instead possible by means of the node information
distributions [12]. This indicates that there are important
correlations in the information centrality of edges
incident on the same node. This also indicates that organic
self-organized cities differ from planned ones more in
terms of their nodes (intersections) than of their edges
(streets), and especially in how they assign importance to
such spaces.
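A cumulative distribution of this kind, and the scale $s$ of an exponential decay, can be estimated from a list of edge centralities as in the following Python sketch. The discrete count replaces the integral, and the least-squares fit of ln P against C is an assumed procedure, since the text does not state how the fits were obtained. Function names and sample values are hypothetical.

```python
import math


def cumulative_distribution(values):
    """Return (C, P(C)) pairs, with P(C) the fraction of edges having centrality >= C."""
    k = len(values)
    ordered = sorted(values)
    points = []
    for idx, c in enumerate(ordered):
        # Emit one point per distinct value, at its first occurrence.
        if idx == 0 or c != ordered[idx - 1]:
            points.append((c, (k - idx) / k))
    return points


def fit_exponential_scale(points):
    """Least-squares estimate of s in P(C) ~ exp(-C/s) from ln P versus C."""
    xs = [c for c, _ in points]
    ys = [math.log(p) for _, p in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is sxy / sxx and equals -1/s.
    return -sxx / sxy


# Hypothetical centrality values for four edges.
sample_points = cumulative_distribution([0.1, 0.2, 0.2, 0.4])

# Synthetic points lying exactly on an exponential with s = 0.02.
synthetic = [(c, math.exp(-c / 0.02)) for c in (0.0, 0.01, 0.02, 0.05)]
s_estimate = fit_exponential_scale(synthetic)
```

On real data the fit would normally be restricted to the range of C where the distribution is well sampled.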

We are finally ready to build the Maximum Centrality
Spanning Trees (MCSTs), i.e. maximum weight spanning trees
where the edge weight is defined as the centrality of the
edge. A graph $G'(N', K')$ is a tree if and only if it
satisfies any of the following four conditions: 1) $G'$ has
$N' - 1$ edges and no cycles; 2) $G'$ has $N' - 1$ edges and
is connected; 3) exactly one simple path connects each pair
of nodes in $G'$; 4) $G'$ is connected, but removing any
edge disconnects it. Given a connected, undirected graph
$G(N, K)$, a spanning tree $T$ is a subgraph of $G$ which is
a tree and connects all the $N$ nodes together. Consequently
$T = T(N, N - 1)$. A single graph can have many different
spanning trees. We can also assign a weight $w_\alpha$ to
each edge $\alpha$, which is usually a number representing
how favorable (for instance how central) the edge is, and
assign a weight to a spanning tree by computing the sum of
the weights of the edges in that spanning tree. A maximum
weight spanning tree is then a spanning tree with weight
larger than or equal to the weight of every other spanning
tree of the graph. It is thus possible to define
appropriate edge weights with the aim of finding particular
structures capable of connecting every single node of the
graph while maximizing (or minimizing) the corresponding
total weight. In particular, for each city we have computed
two different MCSTs, respectively based on betweenness and
information. The two cases are obtained by respectively
fixing $w_\alpha = C^B_\alpha$ and $w_\alpha = C^I_\alpha$,
with $\alpha = 1, \dots, K$.

FIG. 2: Scatter plots showing the correlations between edge
betweenness and edge information in Bologna (top panels)
and San Francisco (bottom panels). Each point represents
one edge in the original graph (left), in the betweenness-based
MCST (center), and in the information-based MCST (right).

Since the two centrality measures focus on different
properties of the network, using both of them allows us to
strengthen our analysis. Moreover, as shown in the left
panels of Fig. 2, $C^B$ and $C^I$ are correlated, although
it is possible to find edges with a low value of $C^B$ and
a high value of $C^I$ (and vice versa). The coefficients of
linear correlation are respectively equal to $r = 0.69$ and
$r = 0.46$ for Bologna and San Francisco. For the
computation of the MCSTs (and of the mLSTs) we have used
Prim's algorithm [25], which yields the result in a time
proportional to $K \log N$. The MCST for
the city of Bologna contains $K' = N - 1 = 540$ links, i.e.
70% of the links of the original graph, while the MCST
for San Francisco has $K' = 168$, i.e. 62% of the links
of the original graph. Since the links have been chosen
according to their centrality values, it turns out that the
set of selected edges in the betweenness-based MCST of
Bologna (San Francisco) accounts for 86% (82%) of the
total betweenness centrality of the original graph, defined
as $\sum_{\alpha=1,\dots,K} C^B_\alpha$. Similarly, the set
of selected edges in the information-based MCST of Bologna
(San Francisco) accounts for 84% (95%) of the total
information centrality. This is due both to the shapes of
the centrality distributions shown in Fig. 1 and to the
edge selection that avoids, in the tree construction, the
formation of cycles. The values of $C^B$ and $C^I$ for the
selected edges are shown in the scatter plots of Fig. 2. In
the case of Bologna, the two measures of centrality have
correlations similar to those in the original graph (the
correlation coefficients are $r^B = 0.61$ and $r^I = 0.64$
in the betweenness-based and information-based MCST,
respectively). Conversely, in San Francisco, the two
variables are less correlated in the MCSTs ($r^B = 0.10$
and $r^I = 0.29$) than in the original graph ($r = 0.46$).
In Fig. 1 (top panels) we have plotted the
edge length distributions of the betweenness-based MCSTs
(dashed lines). It is interesting to observe that, for
the city of Bologna, $n(l)$ has the same shape in the
original graph and in its betweenness-based MCST. This
means that, in the construction of the tree, edges of all
lengths have been removed from the original graph with the
same probability. Conversely, in San Francisco most of
the edges not included in the betweenness-based MCST
are those with the largest length. The same result has
been found for the information-based MCSTs and seems
to be a common characteristic of other planned grid-like
cities.
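The tree construction can be sketched in Python with Prim's algorithm, run on a binary heap with negated weights so that the heaviest available edge is always extracted first. The function names, the toy graph and its centrality values are hypothetical; the second helper computes the share of the total centrality retained by the selected edges, the quantity quoted as a percentage in the text.

```python
import heapq


def maximum_spanning_tree(n, edges):
    """Prim's algorithm on negated weights.

    edges: list of (i, j, weight). Returns the sorted indices of the
    n - 1 edges of a maximum weight spanning tree.
    """
    adj = {i: [] for i in range(n)}
    for a, (i, j, w) in enumerate(edges):
        adj[i].append((j, w, a))
        adj[j].append((i, w, a))
    in_tree = {0}  # grow the tree from node 0
    heap = [(-w, a, v) for v, w, a in adj[0]]
    heapq.heapify(heap)
    chosen = []
    while heap and len(chosen) < n - 1:
        _, a, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # this edge would close a cycle
        in_tree.add(v)
        chosen.append(a)
        for x, w, b in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (-w, b, x))
    if len(chosen) != n - 1:
        raise ValueError("graph is not connected")
    return sorted(chosen)


def centrality_share(edges, chosen):
    """Fraction of the total edge weight carried by the chosen edges."""
    total = sum(w for _, _, w in edges)
    return sum(edges[a][2] for a in chosen) / total


# Hypothetical toy graph: weights play the role of edge centralities.
toy = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7), (3, 0, 0.1), (0, 2, 0.5)]
mcst = maximum_spanning_tree(4, toy)
share = centrality_share(toy, mcst)
```

In the toy graph the tree keeps three of the five edges but most of the total weight, which mirrors the observation that the MCSTs retain a share of centrality larger than their share of links.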

In Fig. 3 we compare graphically the two MCSTs with
the minimum length spanning trees [25]. In the
construction of the latter, the weight $w_\alpha$
associated with each edge $\alpha$ is set equal to the
length of the edge, $l_\alpha$, and represents the cost of
the edge. A Minimum Length Spanning Tree (mLST) is then a
spanning tree with weight (cost) smaller than or equal to
the weight of every other spanning tree of the graph. The
MCSTs obtained are different from the mLSTs. In the case of
Bologna, the betweenness (information) based MCST has a
total length equal to 1.15 (1.14) times the total length of
the mLST, while in the case of San Francisco this ratio is
equal to 1.15 (1.07). In the case of Bologna, the MCST
based on betweenness (information) has 82% (75%) of its
edges in common with the mLST, while in San Francisco it
has 70% (76%) of its edges in common with the mLST. It
is worth noting that the two MCSTs have 77% of their
edges in common in Bologna, whereas this percentage
is smaller in San Francisco (66%). The graphical
visualization of the maximum centrality trees is of interest
for urban planners since the trees express the
uninterrupted chain of urban spaces that serves the whole
system while maximizing centrality over all edges involved.
This method identifies the backbone of a city system as
the sub-network of spaces that are most likely to offer
the highest potential for the life of the urban community
in terms of popularity, safety and location of services,
all factors geographically related to central places. This
is evident in Fig. 3, where the comparison between the trees
in the two cities clearly indicates that the spatial
sub-system that holds a city together in terms of the
shortest trip length is not the same spatial sub-system
that does so in terms of the highest centrality. It is also
worth noting that metric distance is involved in the
algorithms for the calculation of the centrality indices,
so that all the kinds of trees considered here are rooted
in geographic space. Moreover, while the shortest-length
backbone performs effectively when applied to planned
urban fabrics like San Francisco, in self-organized
evolutionary cases like that of Bologna it neither
identifies continuous routes nor clearly distinguishes a
hierarchy of sub-systems in the network, while the
highest-information and especially the highest-betweenness
backbones do. In a way, we would say that organic patterns
are more oriented towards bringing things and people
together in public space than towards shortening the trips
from any origin to any destination in the system, the
latter character being more typical of planned cities.

FIG. 3: Spanning trees of Bologna (above) and San Francisco (below). From left to right: mLSTs, betweenness-based MCSTs and
information-based MCSTs.
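The two comparison figures used above, the fraction of shared edges and the ratio of total lengths, can be computed as in this Python sketch. For variety the minimum length spanning tree is built here with Kruskal's algorithm and a union-find structure rather than with Prim's algorithm; both yield a minimum weight spanning tree. The function names, the toy lengths and the example centrality-based tree are hypothetical.

```python
def minimum_length_spanning_tree(n, edges):
    """Kruskal's algorithm. edges: list of (i, j, length). Returns sorted edge indices."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    # Scan the edges from the shortest to the longest.
    for a in sorted(range(len(edges)), key=lambda a: edges[a][2]):
        root_i, root_j = find(edges[a][0]), find(edges[a][1])
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            chosen.append(a)
    return sorted(chosen)


def compare_trees(tree_a, tree_b, lengths):
    """Return (fraction of tree_a edges also in tree_b, length of tree_a / length of tree_b)."""
    shared = len(set(tree_a) & set(tree_b)) / len(tree_a)
    ratio = sum(lengths[a] for a in tree_a) / sum(lengths[a] for a in tree_b)
    return shared, ratio


# Hypothetical toy graph with street lengths as weights.
streets = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 2.0), (0, 2, 1.5)]
lengths = [w for _, _, w in streets]
mlst = minimum_length_spanning_tree(4, streets)

# Hypothetical centrality-based tree given by its edge indices.
example_mcst = [0, 2, 4]
shared, ratio = compare_trees(example_mcst, mlst, lengths)
```

Because any spanning tree of the same graph has N - 1 edges, the shared fraction is symmetric in the two trees, while the length ratio is at least 1 whenever the second tree is the mLST.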

In conclusion, in this work we have shown that the
concept of MCST leads to a meaningful picture of the
primary sub-system of a city network, which makes it
a single component while minimizing the cost of moving
around and maximizing the potential of places to
achieve social success, safety and popularity. Therefore,
the method has the potential to become a useful tool
in city planning and design, due to its immediate and
powerful visualization outcome.